Statistical literacy: A new mission for data producers

Author

  • Milo Schield
Abstract

Statistical literacy is a new goal for statistical educators. A core element of statistical literacy for consumers is the ability to read and interpret data in the tables and graphs published by national statistical offices. A core element for producers is the ability to create tables, graphs and reports that are unambiguous and comprehensible. It appears that comprehensibility is not considered part of the mission for many national statistical offices (NSOs). Yet can the staff or users read the data generated by these agencies? The 2002 W. M. Keck Statistical Literacy survey indicates that professional data analysts, college students, school mathematics teachers and even college professors have difficulties reading such data. A common reason is confusing captions. Other reasons include user difficulties in decoding tables and in using ordinary English to describe part-whole relations. Recommendations include vetting agency tables and graphs for comprehensibility, assessing the statistical literacy of staff and users, and developing objective standards for using ordinary English to describe rates and percentages and for titling such tables. Establishing these standards can help teachers improve the statistical literacy of students and future leaders so they can use agency-generated data to make better decisions.

1. Statistical literacy

While the phrase “statistical literacy” has a long history, it has only recently become a goal for statistical educators. In 1979, “statistical literacy” was the title of a textbook [11]. In 1982, “statistical numeracy” was described in the Cockcroft report [8]. In 2001, “statistical literacy” was an IASE conference theme. In 2002, “Developing a statistically literate society” was the theme of the International Conference on Teaching Statistics (ICOTS-6).
In 2006, statistical literacy was adopted as a goal by the American Statistical Association (ASA) in endorsing the ASA Guidelines for Assessment and Instruction in Statistics Education (GAISE) [1]. This goal is stated in the first sentence in the PreK-12 portion of the GAISE report, “The ultimate goal: Statistical Literacy,” and in the first recommendation of the College GAISE report: “introductory courses in statistics should, as much as possible, strive to emphasize statistical literacy and develop statistical thinking . . . ” A “statistically literate society” is a goal in the ASA Strategic Plan for Education [2]: “Through leadership in all levels of statistical education, the ASA can help to build a statistically literate society . . . ” Assessing statistical literacy is reviewed in recent articles [5, 10, 25].

The increased attention to statistical literacy does not mean there is clear agreement on its definition [25]. A lack of agreement on the definition or its relevance to the IASE mission may explain the omission of “statistical literacy” in subsequent IASE conferences. It may be that “statistical literacy” is just a buzzword that lacks substance or staying power. Yet a 2009 MAA survey of US four-year colleges found that 17% offered a Statistical Literacy course [24].

But analyzing differences in definition and approach can be bypassed – and entanglement with a potential fad can be avoided – if there is agreement on what statistical literacy involves. Gal’s statement clearly identifies a key element: statistical literacy involves the ability to read and interpret the data in tables and graphs published by government statistical associations [9]. This measure of statistical literacy for data consumers requires that tables and graphs produced by government statistical associations are unambiguous, clear and comprehensible. Producing such tables and graphs requires a high level of fluency in data presentation by these agencies.

1874-7655/11/$27.50 © 2011 – IOS Press and the authors. All rights reserved

2. Missions of national statistical offices

So how does statistical literacy or fluency in data presentation relate to the mission of national statistical offices? Consider these mission statements by the U.S. Census Bureau:
– Mission: “to be the preeminent collector and provider of timely, relevant and quality data about the people and economy of the United States.” Goal: “to provide the best mix of timeliness, relevancy, quality and cost for the data we collect and services we provide.” [32]
– Mission: “The Census Bureau serves as the leading source of quality data about the nation’s people and economy. We honor privacy, protect confidentiality, share our expertise globally, and conduct our work openly. The production of high quality, relevant statistical information rests on principles that the Census Bureau holds dear. Openness to user and respondent concerns, independence and neutrality, strong statistical standards, and protection of confidentiality form the foundation for the work we do.” [33]
Assessing or enhancing the comprehensibility of its tables and graphs is not listed as a high-level goal for this agency. Now consider the aims of the UK Office for National Statistics [27]:
– to provide authoritative, timely and accessible statistics and analysis that enable decision making across UK society, anticipate needs and support public accountability
– to be a trusted and leading supplier of national government statistical expertise and surveys
– to maintain a dynamic portfolio of statistical sources, which reflects changing data needs
– to deliver the sources portfolio in a way that meets user expectations of quality within the available resources
– to minimise the burden on respondents for all survey collections
– our people, systems and processes are able to develop the current business and to respond rapidly to changing demands
– to identify social and technological changes that will impact on what we do and how we do it
These aims mention providing statistics that “enable decision making” but make no mention of measuring the comprehensibility of those statistics. Yet banks and insurers are required to ensure that their forms and publications meet certain objective standards for comprehensibility.

Consider the 2008 IAOS conference: Reshaping Official Statistics [13]. Areas covered included: “Use of administrative data in the statistical system, Use of administrative data in official statistics, Challenges of building register based or other administrative based statistics, More efficient use of statistical data, Questionnaire design and testing, User demands for official statistics, Electronic reporting, and Process orientated statistical production.” Aside from possible “user demands,” it appears that the comprehensibility of data was not a significant agenda item at this conference. Questionnaires may undergo extensive testing, but is there any testing of whether the tables and graphs produced are comprehensible by the general public?

Yet there are signs that comprehensibility may be emerging as a goal for some NSOs:
– In 2008, the International Statistical Literacy Project (ISLP) presented “programs of some National Statistical Offices (NSOs) whose only purpose is to increase the level of statistical literacy of the public.” “By a successful program, we mean a program that has reached the front page of the National Statistical Office web site, that is, a program that constitutes an intrinsic part of the general public.” [17]
– The focus on projects, such as “the Census in Schools” project, indicates some support by NSOs for education and statistical literacy for consumers.
– The Statistics Education Unit of the Australian Bureau of Statistics identified criteria for statistical literacy and presented statistical literacy competencies by grade in school [3].
In summary, statistical literacy does not appear to be a high priority for many National Statistical Offices, although this may be changing. Once NSOs view statistical literacy as central to their mission, they can extend their production-style missions (to generate data that is accurate, timely, and relevant to their users’ needs) to include market-driven missions: to generate accurate and timely data that is comprehensible by and useful to decision makers.

Fig. 1. U.S. Death Rates for Injury by Firearms, Sex, Race and Age.

But are the tables and charts published by National Statistical Offices comprehensible? This is a critical question for NSOs that justify their existence by producing data that is supposedly useful in making decisions. Consider two groups of data consumers: (1) journalists, the staff of politicians, politicians who vote on legislation and on the NSOs’ budgets, leaders who make business and social decisions, and the general public, and (2) professional data analysts at NSOs, college professors, college students and school mathematics teachers.
This paper presents data on the statistical literacy of the second group: professional data analysts, college professors, college students and school mathematics teachers.

3. Statistical illiteracy

From 1998 to 2002, students taking Statistical Literacy at Augsburg College studied tables of rates and percentages presented in the U.S. Statistical Abstract. This exercise indicated that students had difficulties reading these tables. The Director at that time, Glenn King, provided copies of the U.S. Statistical Abstract for use by these students and participated in a preliminary survey to identify the level of statistical literacy in reading summary statistics presented in tables, graphs and statements.

As an example, here are two tables from the U.S. Statistical Abstract that college students in non-quantitative majors found difficult to read. In Fig. 1, students found the title confusing [31]. They could see data classified by age, sex and race, but not by firearm. They did not realize that the “by” in “by firearms” was short for “caused by.” “Classified by” and “caused by” may have been conflated to save space in the title.

In Fig. 2, “by” means three things: “classified by”, “caused by” and “distributed by.” [31] Since rate tables seldom involve “by” as “distributed by”, students failed to see that the rows were parts (numerators). They mistakenly presumed the rows were wholes: pre-existing conditions which led to the different death rates. So they mistakenly said, “Among those with ‘diseases of the heart’, the death rate was 152 per 100,000 in 1990.” A correct statement would be, “The rate of death due to diseases of the heart was 152 per 100,000 population in 1990.”

4. 2002 statistical literacy survey

In 2002, an international survey of statistical literacy was conducted by the W. M. Keck Statistical Literacy Project [21].
This survey focused entirely on informal statistics – the ability to describe and compare rates and percentages as presented in tables, graphs and statements. Here are the average error rates for each of the four groups surveyed: college teachers (29%), professional data analysts (45%), college students (49%), and school mathematics teachers (55%). The actual survey is also available [20]. The Appendix contains the results for professional data analysts. Here are some highlights:
– 30% had difficulty reading a simple 100% row table [Q23]
– 43% were unable to identify an invalid comparison using data in a 100% row table [Q28]
– About half were unable to correctly classify descriptions of percentages in a two-way half table [Q30-35] or in one-way half tables [Q44-48, Q49-50 and Q54-56]
– About half were unable to correctly classify descriptions of rates in a rate table [Q60-63]
Note: these are error rates for professional analysts working for national statistical offices.

Fig. 2. U.S. age-adjusted death rates by selected causes.

Fig. 3. Error rates by occupation and percentile.

Figure 3 presents the distribution of error rates by percentile within each of the four groups. Focusing on averages may draw attention away from the range of scores within each group. Specifically, the best scorers had error rates of 10% to 20%; the worst had error rates of 60% to 80%. These high error rates within each group indicate that statistical illiteracy is widespread.

5. Explaining statistical illiteracy

One might expect that professional data analysts would do better than school teachers, and that school teachers would do better than college students. So why did school teachers score worse than college students, and why did professional data analysts do only slightly better? In Table 1, the lowest error rates are for college faculty (29%, top row) and for native English speakers (43%, left column).
The highest error rates (shaded cells) for each row typically involve those who learned English after childhood (52%, right column). It appears that being a non-native English speaker may be a risk factor for statistical illiteracy. One reason that school teachers had lower scores than the college students is language: most school teachers were non-native speakers while most college students were native speakers. Language also helps explain why professional data analysts had lower scores than college teachers: most data analysts were non-native speakers; most college teachers were native speakers.

This distribution of respondents by English background may seem unexpected. In fact, all those labeled school teachers are school mathematics teachers in South Africa. Over half of those labeled data analysts are staff at the South African Statistical Society while the others are staff at the US Census Bureau and at STATS. While these are unusual groups or mixtures for those in first-world agencies, they may reflect some of the challenges for NSOs whose English-language statistics are being read by an increasing number of non-native speakers.

Table 1. Error rate (count) by occupation and English-speaking background

6. The Jenkinson project

As can be seen in the preceding examples and the results in Appendix A, the titles for tables and graphs can be ambiguous when describing rates and percentages. Consider these two sets of titles. Given the small changes in syntax within each set, do all the members of the following sets indicate the same part-whole relationship?
– The suicide rate of males, the males’ rate of suicide and the male rate of suicide
– The percentage of smokers who are men, the percentage of male smokers, the percentage of smokers among men
In each case at least two members indicate different part-whole relations [18]. The moral: small changes in syntax can create big differences in semantics.
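The gap between two of the phrasings in the second set can be made concrete with a small calculation. The following Python sketch is only an illustration: the counts are invented (they do not come from the survey or from any agency table), and the names `counts` and `percentage` are hypothetical. It shows that the same part (men who smoke) gives different percentages depending on which group is taken as the whole.

```python
# Hypothetical counts, invented for illustration only (not from the paper).
counts = {
    ("men", "smoker"): 30,
    ("men", "non-smoker"): 70,
    ("women", "smoker"): 20,
    ("women", "non-smoker"): 80,
}


def percentage(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# The shared part: people who are both men and smokers.
men_smokers = counts[("men", "smoker")]

# Two different wholes.
all_smokers = sum(n for (sex, status), n in counts.items() if status == "smoker")
all_men = sum(n for (sex, status), n in counts.items() if sex == "men")

# "The percentage of smokers who are men": whole = smokers, part = men.
pct_smokers_who_are_men = percentage(men_smokers, all_smokers)

# "The percentage of smokers among men": whole = men, part = smokers.
pct_smokers_among_men = percentage(men_smokers, all_men)

print(f"Percentage of smokers who are men: {pct_smokers_who_are_men:.0f}%")
print(f"Percentage of smokers among men:   {pct_smokers_among_men:.0f}%")
```

With these made-up counts the first phrase describes 60% and the second 30%, even though both are computed from the same 30 men who smoke.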
Understanding these conditional relationships is crucial to understanding the various ways we have of dealing with context – of taking into account the influence of related factors.

Of course these are subtleties compared to a simple reversal of part and whole, as appeared in a 2009 AP story: “Study says too much candy could lead to prison” [26]. The AP stated that “Of the children who ate candies or chocolates daily at age 10, 69 percent were later arrested for a violent offense by the age of 34.” But the AP got it backward. The true statistic was the inverse: 69% of violent criminals ate candy daily at age 10.

In 1949, Jenkinson [14] noted the difficulty of titling tables of percentages. He proposed “a search for a systematic presentation which focuses attention on basic problems in percentage description.” That was more than 60 years ago! This isn’t rocket science; it won’t cost billions or even millions. This Jenkinson “project” should be carried out if data producers are to avoid the charge of being statistically illiterate. Schield suggested some rules [18] but the producers of tables need to set their own standards. Some may think the use of the word “project” is rather grandiose. But based on my experience with NSOs, getting agreement on when to use percent and when to use percentage may prove to be a project all by itself.

One might even argue that some of the statistical illiteracy of users is caused by the lack of fluency in data presentation by data producers: either the data producers can’t produce tables and charts that are comprehensible, or they don’t have objective standards that would guide users in reading their tables and charts. If NSOs would provide such standards, they could guide teachers in teaching students how to be statistically literate consumers.

An alternate explanation is given by Lehohla [15], who noted the dramatic increase in interest in official statistics by non-professionals.
Tables and charts that are readily comprehended by data professionals may appear ambiguous or incomprehensible to nonprofessional data consumers.

7. Training government employees

When the Director of Statistics in South Africa, Pali Lehohla, realized that some of his staff had difficulty reading simple tables and graphs, he recognized that other government employees charged with making appropriate decisions using data were even more likely to have such difficulties. He wondered about the possibility of setting up training programs for all such government employees. This seemed like an impossible task in 2002, given the ongoing lack of resources.

But web-based solutions are emerging. Programs for improving statistical literacy include on-line delivery of multiple choice exercises [23], a web program that analyzes a user’s ability to use ordinary English to describe and compare rates and percentages as presented in graphs, tables and statements [6,7] and an entire web-based statistical literacy course for consumers [12] now offered at Augsburg College.

8. Supporting policy goals

National Statistical Offices may become involved in supporting policy goals. The United Nations Development Group has identified a number of Millennium Development Goals (MDGs). Achieving these goals requires enhancing the “statistical capacity and literacy across country partners in order to increase data availability and enhance data use and support evidence-based policy-making [28–30].” If NSOs become responsible for supporting policy goals as an integral part of their mission, they may also take on responsibility for ensuring that their data is useful to and understandable by policy makers. This may entail taking on responsibility for ensuring that policy makers and their staff are statistically literate consumers – that they have the necessary skills to read and interpret the data provided by the NSOs.

9. Other ways of supporting statistical literacy

There are other ways to support or improve the statistical literacy of users besides improving the comprehensibility of tables and graphs. These include:
– Flagging averages where the distribution is bimodal. This may occur when some of the members have none of the characteristic being measured. E.g., the average US family spends more on pets than on alcohol; the average American is drinking less and working less.
– Improving data accessibility: Some PDF tables are not designed to take advantage of the new Acrobat 9 commands (“Copy as a Table”, “Save as a Table” and “Open Table in Spreadsheet”), so that numeric-appearing data is not actually available as numeric data.
– Focusing attention on how the definition of a group can influence the size of a number. As Joel Best [4] noted, all statistics are socially constructed. Consider these examples: (a) OPEC countries supply 50% of US oil imports, but only 30% of US oil usage; (b) the average US farm is 440 acres, while the average US family farm is 326 acres; and (c) annual income is $43K for households, $53K for families and $62K for married couples [22].
– Promoting multivariate thinking [16]: All too often data is classified by factors that are secondary – such as by geographic region – when other factors are more highly correlated with the outcome in question. In some cases, taking these other factors into account can increase, decrease or even reverse the observed association between the secondary factor and the outcome of interest. Helping users be aware of this possibility might enhance their statistical literacy and increase their trust in the veracity of the data when it appears to say different things depending on what is taken into account.
But these all enhance usability and interpretation, which seems to be secondary to enhancing the comprehensibility of the data presentation.
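The reversal described under multivariate thinking can be sketched with a few lines of arithmetic. The Python example below is a hypothetical illustration, not data from the paper: the two regions, the two age groups and all counts are invented, and the names `data`, `rate_per_1000` and `crude_rate` are assumptions made for this sketch. It compares death rates by region alone with death rates by region within each age group.

```python
# Hypothetical (population, deaths) by region and age group.
# All numbers are invented for illustration; they are not from the paper.
data = {
    "A": {"young": (100, 1), "old": (900, 45)},
    "B": {"young": (900, 18), "old": (100, 6)},
}


def rate_per_1000(population, deaths):
    """Deaths per 1,000 population."""
    return 1000.0 * deaths / population


def crude_rate(region):
    """Death rate for a region, ignoring age."""
    population = sum(p for p, _ in data[region].values())
    deaths = sum(d for _, d in data[region].values())
    return rate_per_1000(population, deaths)


# Rates classified by region only.
crude = {region: crude_rate(region) for region in data}

# Rates classified by region and age group.
by_age = {
    region: {age: rate_per_1000(*cell) for age, cell in groups.items()}
    for region, groups in data.items()
}

for region in data:
    print(region, "crude:", crude[region], "by age:", by_age[region])
```

In this made-up case region A has the higher crude rate, yet within each age group region A has the lower rate, because region A's population is mostly old. A table classified only by region would point the reader in the opposite direction from a table that also takes age into account.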




Publication date: 2011